Abstract

Breast cancer is one of the most common cancers among women worldwide. Several genes are known to
confer susceptibility to breast cancer. Among them, TP53 is one of the major genetic risk factors and
is mutated in many breast tumor types. TP53 mutations in breast cancer are associated with poor
prognosis and chemoresistance, which makes them a promising molecular target for the treatment of
breast cancer. In this study, we present a computational screening and molecular dynamics simulation
of breast cancer associated deleterious non-synonymous single nucleotide polymorphisms in TP53. We
predicted three deleterious coding non-synonymous single nucleotide polymorphisms, rs11540654 (R110P),
rs17849781 (P278A) and rs28934874 (P151T), in TP53 with a phenotype in breast tumors using the
computational tools SIFT, PolyPhen-2 and MutDB. We performed molecular dynamics simulations to study
the structural and dynamic effects of these TP53 mutations in comparison to the wild-type protein.
Our simulations revealed detailed consequences of the mutations on the p53 DNA-binding core domain
that may provide insight for therapeutic approaches in breast cancer.
Introduction

Breast cancer is one of the most common malignancies and a leading cause of cancer death among
women around the world. Globally, annual breast cancer deaths rose from about 250,000 (2.5 lakhs)
in 1980 to 425,000 (4.25 lakhs) in 2010 [1]. In countries like China and India, its incidence has
increased by around 30% over the last decade, whereas in Japan, Korea and Singapore it has doubled
or even tripled [2]. According to National Cancer Institute (USA) statistics, the estimated new
cases of breast cancer in the United States for the year 2013 are 232,340 in females and 2,240 in
males, whereas the estimated breast cancer deaths are 39,620 in females and 410 in males. The
common risk factors for breast cancer can be broadly categorized into two types, genetic and
non-genetic. Of these, genetic risk factors account for 5-10% of breast cancer cases. Studies have
shown that fifty-one variants in 40 genes are significantly associated with breast cancer risk.
Among them, variants in six genes (BRCA1, BRCA2, TP53, PTEN, STK11 and CDH1) show strong
association, variants in four genes (ATM, CHEK2, BRIP1 and PALB2) show moderate association, and
approximately 20 variants in other genes show weak association [3,4].



Among the genes conferring high breast cancer risk, TP53 is known to be mutated in 30% of breast
cancer cases, with a higher frequency in some tumor subtypes [5]. TP53 encodes p53, which is one of
the most important tumor suppressor proteins in human cancers. p53 is a multi-domain protein of 393
residues containing i) an acidic N-terminal transcription activation domain (1-44); ii) a
proline-rich regulatory domain (62-94); iii) a central, well conserved, sequence-specific
DNA-binding domain (110-292); iv) an oligomerization domain (325-363) and v) a C-terminal domain
containing multiple regulatory signals (363-393) [6]. It functions as a tetramer. It is reported
that 75% of all mutations in TP53 are missense, resulting in the substitution of a single amino acid
with another, and that these mutations are predominantly distributed in exons 4-9, which encode the
DNA-binding domain of the protein [7].

Understanding how these single nucleotide polymorphisms (SNPs) affect the function of proteins is
an important area of research. Efficient identification of such SNPs would be useful for SNP
selection in genetic studies, for understanding the molecular basis of disease, and for predicting
the effects of in vitro and in vivo mutagenesis experiments [8]. Among the SNPs, non-synonymous
coding SNPs (nSNPs) are those located in the coding regions that result in an amino acid variation
in the protein products of genes. They are believed to have a high impact on the phenotype [9]. In
the present study, we focused on the nSNPs in the coding region of the TP53 gene that have an impact
on the breast cancer phenotype. We explored the possible relationship between genetic mutation and
phenotypic variation using the computational tools SIFT, PolyPhen-2 and MutDB to prioritize the
deleterious breast cancer associated nSNPs from dbSNP datasets.

Figure 1. Ligplot showing the interactions of the metal ion (Zn) with the amino acid residues of the
protein. An atom of Zn is bound with a tetra-coordinate geometry to three cysteines (Cys 176, 238
and 242) and one histidine (His 179).
doi:10.1371/journal.pone.0104242.g001

Molecular dynamics simulation (MDS), on the other hand, is an important tool for understanding the
effect of mutations on protein structure, as it provides information about the protein at the
atomic level on a reasonable time scale. Several previous studies have used molecular dynamics to
analyze the impact of mutations on TP53 [10-14]. To check (i) whether the three mutants (R110P,
P151T and P278A) have an impact on the conformation of the functionally significant regions of the
p53 DNA-binding core domain (p53C), (ii) whether the mutant structures deviate from the native p53C,
and (iii) whether the mutants change the flexibility of the p53C, we performed molecular dynamics
simulations of the WT and the three mutants. Since a significant fraction of p53 appears in the apo
state at physiological temperature, and insufficient zinc is linked to misfolding, particularly in
tumorigenic mutations, we performed both apo (zinc-free) and holo simulations and present them here.
The results showed that the three mutants R110P, P151T and P278A confer deleterious effects on the
p53C. Overall, the objective of our study is to predict the breast cancer associated nSNPs and to
further reveal the conformational flexibility of mutated apo and holo p53C through extensive
molecular dynamics simulation.

Figure 2. Distribution of TP53 coding nonsynonymous SNPs (nSNPs), coding synonymous SNPs (sSNPs),
3' UTR SNPs, 5' UTR SNPs and intronic SNPs.

Figure 3. Backbone RMSD values for WT, R110P, P151T and P278A during A) apo simulations and B) holo
simulations for p53C. Black: WT, red: R110P, green: P151T and blue: P278A.
doi:10.1371/journal.pone.0104242.g003

Materials and Methods 

Datasets 

TP53 SNPs were retrieved from the dbSNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/,
Build 138; access date: May 13, 2013) [15] for our computational analysis.

Prediction of deleterious coding nSNPs 

We used both SIFT and PolyPhen-2 to screen the deleterious coding nSNPs from the other SNPs for
TP53. 'Sorting Tolerant From Intolerant' (SIFT) (http://sift.jcvi.org/; access date: May 15, 2013)
is a multi-step algorithm that uses a sequence homology based approach [16] to predict whether an
amino acid substitution will affect protein function. For a given protein sequence, SIFT chooses
related proteins, aligns them with the query and assigns a score to each residue. Scores ranging
from 0 to 0.05 are considered intolerant or deleterious amino acid substitutions, whereas scores
above 0.05 and up to 1 are considered tolerant or neutral [17,18]. We submitted our query to the
SIFT server in the form of dbSNP ids. PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/; access
date: May 18, 2013) [19], on the other hand, is a web based server that predicts the functional
significance of an allele replacement from its individual features using a Naive Bayes classifier.
The PolyPhen-2 prediction models were trained and tested using two pairs of datasets. The first,
HumDiv, was compiled from all damaging alleles with known effects on molecular function causing
human Mendelian diseases that are present in the UniProtKB database, together with differences
between human proteins and their closely related mammalian homologs, which are assumed to be
non-damaging. The second, HumVar, consists of all human disease-causing mutations from UniProtKB,
together with common human nsSNPs (MAF > 1%) without annotated involvement in disease, which were
treated as non-damaging. A mutation is appraised qualitatively as benign, possibly damaging or
probably damaging based on pairs of false positive rate (FPR) thresholds, optimized separately for
each model. The query was submitted in the form of dbSNP ids to WHESS.db, which gives quick access
to a precomputed set of PolyPhen-2 predictions for the whole human exome sequence space (WHESS).
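
The SIFT cutoff described above can be illustrated with a minimal sketch. The function name
classify_sift and the choice to treat a score of exactly 0.05 as deleterious (matching the "less
than or equal to 0.05" criterion used in the Results) are assumptions made for illustration; the
scores are the orthologue-based tolerance indices listed in Table 1.

```python
def classify_sift(score, cutoff=0.05):
    """Label a SIFT tolerance index; scores at or below the cutoff are deleterious."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT tolerance indices lie in [0, 1]")
    return "deleterious" if score <= cutoff else "tolerated"


# Orthologue-based tolerance indices taken from Table 1
examples = {"R110P": 0.04, "P151T": 0.02, "P278A": 0.00, "P72R": 0.26}
labels = {change: classify_sift(score) for change, score in examples.items()}

for change, label in labels.items():
    print(change, label)
```

This only reproduces the thresholding step; the tolerance index itself comes from the SIFT
alignment and is not recomputed here.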

Prediction of phenotypic consequence for deleterious 
coding nSNPs 

The phenotypic consequences of deleterious coding nSNPs were predicted using MutDB
(http://www.mutdb.org/cgi-bin/mutdb.pl; access date: May 20, 2013), a tool that integrates publicly
available databases of human genetic variation with molecular features and clinical phenotype data
[20]. The gene symbol ('TP53') was used as the search term.

Modeling nSNPs locations on protein structure and 
Molecular dynamics 

To investigate the structural consequences of the mutations on TP53, we performed molecular
dynamics simulations. Initial coordinates were extracted from the crystal structure of the p53 core
domain in the absence of DNA (PDB ID: 2ocj, chain A; resolution 2.05 Å) [21]. All water molecules
were removed from the crystal structure, and the mutants (MTs) R110P, P151T and P278A were created
by replacing the wild-type (WT) residue with its polymorphic residue using PyMOL [22]. Molecular
dynamics analysis was performed at 37°C (physiological temperature) and neutral pH using GROMACS
4.5.3 (http://www.gromacs.org/) [23-25]. The p53 core domain contains Zn2+, which is essential for
activity. A LIGPLOT [26] scheme of the Zn2+ interaction in the crystal structure of the p53 core
domain (PDB ID: 2ocj, chain A; resolution 2.05 Å) is shown in Fig. 1. Zn2+ remains bound to the p53
core domain at temperatures below 30°C, and it rapidly dissociates at physiological temperature
such that a significant fraction appears in the apo state [27]. Consequently, we carried out both
apo and holo simulations of the wild-type (WT) and mutant (MT) p53 core domain and present them
here. The system was solvated by adding explicit flexible SPC water [28] in a cubic box whose walls
were located >10 Å from all protein atoms. Cl- counter ions (5, 3, 3, 5, 3, 5, 2 and 4 for holo WT,
apo WT, apo P151T, holo P151T, apo P278A, holo P278A, apo R110P and holo R110P, respectively) were
added to neutralize the total charge of the system. The box size was set to
4.833 nm × 4.027 nm × 4.794 nm with box vectors 7.3 nm × 7.3 nm × 7.3 nm and box angles of 90° for
each side. Each solvated structure was energy minimized for 50000 steps of steepest descent
minimization, terminating when the maximum force fell below 1000 kJ/mol/nm. After energy
minimization, the system was subjected to equilibration at constant temperature (300 K) and
pressure (1 bar) with a time step of 2 fs and the non-bonded pair list updated every five steps,
under the conditions of position restraints for heavy atoms and LINCS
Table 1. Prediction scores found to be functionally significant by the SIFT server.

                                                                   Tolerance index
dbSNP ID      Nucleotide change  Amino acid change  Protein ID     Using orthologues  Using homologues
rs1042522     C/G                P72R               NP_000537      0.26               0.80
rs1642789     A/T                C339S              NP_001119585   0.05               0.00
rs1800371     C/T                P47S               NP_000537      0.49               0.06
rs2287499     C/G                R68G               NP_001137462   0.55               0.28
rs3021068     A/T                C341S              NP_001119586   0.00               0.00
rs11540652    A/G                R248Q              NP_000537      0.00               0.04
rs11540654    C/G/T              R110P              NP_000537      0.04               0.00
rs17849781    C/G                P278A              NP_000537      0.00               0.05
rs17880282    A/G                P11S               NP_001137462   0.00               0.00
rs17881470    G/T                S366A              NP_000537      0.77               0.01
rs17882252    A/G                E339K              NP_000537      0.15               0.00
rs28934571    G/T                R249S              NP_000537      0.00               0.00
rs28934573    C/T                S241F              NP_000537      0.00               0.00
rs28934574    C/T                R282W              NP_000537      0.00               0.00
rs28934575    A/G/T              G245S              NP_000537      0.05               0.00
rs28934576    A/G                R273H              NP_000537      0.00               0.03
rs28934577    A/T                L257Q              NP_000537      0.00               0.00
rs28934578    A/G                R175H              NP_000537      0.00               0.00
rs28934873    C/T                M133T              NP_000537      0.00               0.00
rs28934874    A/C/T              P151T              NP_000537      0.02               0.00
rs28934875    C/G                A138P              NP_000537      0.08               0.00
rs34067256    C/G                P136R              NP_001137462   0.31               0.93
rs35163653    A/G                V217M              NP_000537      0.06               0.00
rs35993958    C/G                G360A              NP_000537      0.76               0.27
rs55819519    C/T                R290H              NP_000537      0.29               0.09
rs55832599    A/G                R267W              NP_000537      0.00               0.00
rs56184981    C/T                N311S              NP_000537      0.74               0.39
rs72661117    A/G                D184N              NP_000537      0.25               0.00
rs72661119    A/G                N263D              NP_000537      0.30               0.01
rs80184930    A/G                S378P              NP_000537      0.10               0.00
rs111897235   A/G                A59T               NP_001137462   0.24               0.02
rs112431538   C/T                E285K              NP_000537      0.01               0.00
rs121913343   C/T                R273C              NP_000537      0.00               0.00

doi:10.1371/journal.pone.0104242.t001



constraints [29] for all bonds. The temperature was kept constant using a Berendsen thermostat
[30]. Electrostatic interactions were calculated using the particle mesh Ewald summation method
[31]. Finally, eight 10 ns molecular dynamics simulations (MDS) were performed (apo WT, holo WT,
apo R110P, holo R110P, apo P151T, holo P151T, apo P278A and holo P278A).

Analysis of Molecular dynamics trajectories 

Structural deviations between the native and mutant structures, namely the root mean-square
deviation (RMSD), root mean-square fluctuation (RMSF), solvent-accessible surface area (SASA) and
radius of gyration, were computed using the g_rms, g_rmsf, g_sas and g_gyrate built-in functions of
the GROMACS package, together with a secondary structure calculation. The average number of
protein-solvent intermolecular hydrogen bonds was computed and analyzed using g_hbond, with a
cutoff radius of 0.35 nm between the donor and the acceptor. Density maps were plotted using
g_densmap, and average values of the simulation output data were obtained using g_analyze of
GROMACS. Graphs were plotted using the GRACE software (http://plasma-gate.weizmann.ac.il/Grace/).
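
The quantities named above can be written down compactly. The following NumPy sketch is not the
GROMACS implementation; it only illustrates, under stated assumptions, what RMSD after
least-squares superposition, per-atom RMSF and radius of gyration measure. The function names, the
use of the Kabsch algorithm for fitting and the synthetic coordinates are all assumptions made for
illustration.

```python
import numpy as np


def kabsch_rmsd(ref, mob):
    """RMSD between two (atoms, 3) coordinate sets after optimal superposition."""
    ref_c = ref - ref.mean(axis=0)              # remove translation
    mob_c = mob - mob.mean(axis=0)
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)   # SVD of the 3x3 correlation matrix
    d = np.sign(np.linalg.det(u @ vt))          # guard against improper rotations
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = mob_c @ rot - ref_c
    return float(np.sqrt((diff ** 2).sum() / len(ref)))


def rmsf(traj):
    """Per-atom RMSF for a pre-fitted trajectory of shape (frames, atoms, 3)."""
    mean = traj.mean(axis=0)
    return np.sqrt(((traj - mean) ** 2).sum(axis=2).mean(axis=0))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration; equal masses if none are given."""
    masses = np.ones(len(coords)) if masses is None else np.asarray(masses, float)
    com = np.average(coords, axis=0, weights=masses)
    sq_dist = ((coords - com) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


# Synthetic check: a rotated and shifted copy of a structure has RMSD ~ 0
rng = np.random.default_rng(0)
ref = rng.normal(size=(20, 3))
theta = 0.7
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
mob = ref @ rz + np.array([1.0, -2.0, 0.5])
print(kabsch_rmsd(ref, mob))
```

In practice the trajectory frames would be read from the simulation output rather than generated,
and the values reported in this study come from the GROMACS tools listed above.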

Principal component analysis 

To analyze and visualize the overall motions in the simulations, essential dynamics (ED) was
carried out according to the protocol in the GROMACS software package [32]. Covariance analysis,
also called principal component analysis or ED, extracts the correlated motions of proteins in
order to understand the motions that are most fundamental to their activity. ED is generally
employed to characterize the large-scale collective motions of a protein. Covariance matrices of
both WT and MT simulations were constructed using the Cα atom trajectories, as it has been shown
that they contain all the information needed for a reasonable description of the large concerted
motions of the protein [32]. We used the GROMACS utilities g_covar and g_anaeig to analyze the
trajectories.
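
As a rough illustration of the covariance analysis described above, the sketch below builds the
positional covariance matrix from a pre-fitted Cα trajectory, diagonalizes it and reports its
trace. It is a minimal NumPy sketch, not the g_covar code: the function name, the normalization by
the number of frames and the random toy trajectory are assumptions.

```python
import numpy as np


def essential_dynamics(traj):
    """PCA of a pre-fitted trajectory of shape (frames, atoms, 3), coordinates in nm.

    Returns eigenvalues (descending, nm^2), eigenvectors (as columns) and the
    trace of the covariance matrix.
    """
    x = traj.reshape(traj.shape[0], -1)      # flatten to (frames, 3 * atoms)
    x = x - x.mean(axis=0)                   # fluctuations about the mean structure
    cov = x.T @ x / x.shape[0]               # positional covariance matrix
    evals, evecs = np.linalg.eigh(cov)       # symmetric eigendecomposition
    order = np.argsort(evals)[::-1]          # largest collective motions first
    return evals[order], evecs[:, order], float(np.trace(cov))


# Toy trajectory: 200 frames of 5 atoms with random fluctuations
rng = np.random.default_rng(1)
toy = rng.normal(scale=0.1, size=(200, 5, 3))
evals, evecs, trace = essential_dynamics(toy)
print(trace, evals[:3])
```

Because the trace is invariant under diagonalization, it equals the sum of the eigenvalues and can
be read as the total positional variance of the selected atoms, which is why it serves as a single
number summarizing overall flexibility.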

Results and Discussion 

SNP data set from dbSNP 

The dbSNP database contains a total of 14613 SNPs for the TP53 gene, of which 637 were found to be
Human (active) SNPs (i.e., active human RS, not including those that have been merged). Among the
637 Human (active) SNPs, 100 were coding nSNPs, 31 were coding synonymous, 52 were in the mRNA
3' UTR region, 38 were in the mRNA 5' UTR region and 451 were in the intronic regions. Fig. 2 shows
that the vast majority of SNPs occur in the intronic region (70.8%) and that nSNPs (15.6%) are more
frequent than synonymous SNPs (4.8%) and SNPs occurring in the mRNA 3' UTR (8.1%) and 5' UTR (5.9%)
regions. We selected coding nSNPs for our investigation.
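
The percentages quoted above can be checked with a few lines of arithmetic. The sketch below
assumes that each share is the class count divided by the 637 active SNPs and truncated (not
rounded) to one decimal place, since that reading reproduces the quoted figures; the helper name
truncated_percent is hypothetical. The class counts sum to 672 rather than 637, which would be
consistent with some SNPs being annotated in more than one class, although the text does not say
so.

```python
TOTAL_ACTIVE = 637  # Human (active) TP53 SNPs reported in the text

counts = {
    "coding nonsynonymous": 100,
    "coding synonymous": 31,
    "3' UTR": 52,
    "5' UTR": 38,
    "intronic": 451,
}


def truncated_percent(count, total):
    """Percentage of total, truncated to one decimal place (integer arithmetic)."""
    return (count * 1000 // total) / 10


shares = {name: truncated_percent(n, TOTAL_ACTIVE) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
print("sum of class counts:", sum(counts.values()))
```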

Deleterious nSNPs by SIFT server 

Among the 100 nSNPs from dbSNP, 18 were found to be deleterious, with a tolerance index score of
less than or equal to 0.05. Of these 18 deleterious nSNPs, 11 had a highly deleterious tolerance
index score of 0.00 using both orthologues and homologues in the protein alignment. The remaining
7 deleterious nSNPs had tolerance index scores of 0.01, 0.02, 0.03, 0.04 or 0.05 using orthologues
and homologues in the protein alignment (Table 1). Among the 18 nSNPs predicted to be deleterious,
three showed a nucleotide change of A/T, five of A/G, one of C/G, five of C/T, one of G/T, one of
C/G/T, one of A/G/T and one of A/C/T. The A/G and C/T changes therefore occurred most often. Amino
acid changes, on the other hand, were mostly from special amino acids to polar amino acids with
uncharged R groups. Eight of the changes occurred at arginine residues, three at proline residues,
two at cysteine residues, and the remaining five at serine, glycine, leucine, methionine and
glutamic acid residues (Table 1).

Damaged nSNPs by PolyPhen-2 Server 

To predict the functional significance of an allele replacement, the 100 nSNPs analyzed by SIFT
were submitted to the PolyPhen-2 server. PolyPhen-2 qualitatively predicts whether a mutation is
benign, possibly damaging, or probably damaging using two Bayesian probabilistic models, HumDiv and
HumVar. For Mendelian disease diagnostics, the HumVar model is recommended, as it should
distinguish mutations with drastic effects from normal human variation, whereas the HumDiv model is
recommended for identifying variants where even mildly deleterious alleles are treated as damaging
[19]. Among the 100 nSNPs from dbSNP submitted to PolyPhen-2, 37 were found to be possibly or
probably damaging by both HumDiv and HumVar predictions, 2 were predicted as possibly damaging by
HumDiv only, 36 were predicted as benign by both HumDiv and
Table 3. Phenotype information of TP53 variants.

                                                  Prediction
dbSNP ID      Nucleotide change  Amino acid change  SIFT                    PolyPhen-2         Phenotype (MutDB)
rs11540652    A/G                R248Q              Intolerant              Probably damaging  In LFS and many types of tumors
rs11540654    C/G/T              R110P              Intolerant              Probably damaging  In a breast tumor
rs17849781    C/G                P278A              Potentially intolerant  Probably damaging  In a breast tumor
rs28934571    G/T                R249S              Intolerant              Probably damaging  In many types of tumors
rs28934573    C/T                S241F              Intolerant              Probably damaging  In a colon tumor
rs28934574    C/T                R282W              Intolerant              Probably damaging  In esophageal adenocarcinoma and many types of tumors
rs28934575    A/G/T              G245S              Potentially intolerant  Probably damaging  In esophageal adenocarcinoma and many types of tumors
rs28934576    A/G                R273H              Intolerant              Possibly damaging  In LFS, colon and esophagus tumors
rs28934577    A/T                L257Q              Intolerant              Probably damaging  Nil
rs28934874    A/C/T              P151T              Intolerant              Possibly damaging  In a breast tumor
rs28934875    C/G                A138P              Potentially intolerant  Probably damaging  In a lung tumor
rs35163653    A/G                V217M              Potentially intolerant  Possibly damaging  Nil
rs55832599    A/G                R267W              Intolerant              Probably damaging  Nil
rs112431538   C/T                E285K              Intolerant              Probably damaging  Nil
rs121913343   C/T                R273C              Intolerant              Probably damaging  Nil

doi:10.1371/journal.pone.0104242.t003



HumVar predictions, and the remaining 25 were not scored (Table 2). Only SNPs predicted as possibly
damaging or probably damaging by both HumDiv and HumVar were considered for our study.

Breast Cancer related mutations by Mutdb database 

Results from both SIFT and PolyPhen-2 analysis showed that among the 100 nSNPs, only 15 SNPs were
predicted to be deleterious or damaging to protein function. These 15 SNPs were submitted to MutDB
to check whether they confer a breast cancer phenotype. The results showed that 3 SNP mutations,
rs11540654 (R110P), rs17849781 (P278A) and rs28934874 (P151T), are known to have a phenotype in
breast tumors (Table 3).
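
The final selection step can be expressed as a simple filter over the rows of Table 3. The sketch
below is only an illustration of that step: the tuple layout, the function name breast_associated
and the keyword matching on the phenotype text are assumptions, not part of the SIFT, PolyPhen-2
or MutDB tools.

```python
# (dbSNP ID, amino acid change, SIFT, PolyPhen-2, MutDB phenotype), from Table 3
table3 = [
    ("rs11540652", "R248Q", "Intolerant", "Probably damaging", "In LFS and many types of tumors"),
    ("rs11540654", "R110P", "Intolerant", "Probably damaging", "In a breast tumor"),
    ("rs17849781", "P278A", "Potentially intolerant", "Probably damaging", "In a breast tumor"),
    ("rs28934571", "R249S", "Intolerant", "Probably damaging", "In many types of tumors"),
    ("rs28934573", "S241F", "Intolerant", "Probably damaging", "In a colon tumor"),
    ("rs28934574", "R282W", "Intolerant", "Probably damaging",
     "In esophageal adenocarcinoma and many types of tumors"),
    ("rs28934575", "G245S", "Potentially intolerant", "Probably damaging",
     "In esophageal adenocarcinoma and many types of tumors"),
    ("rs28934576", "R273H", "Intolerant", "Possibly damaging", "In LFS, colon and esophagus tumors"),
    ("rs28934577", "L257Q", "Intolerant", "Probably damaging", "Nil"),
    ("rs28934874", "P151T", "Intolerant", "Possibly damaging", "In a breast tumor"),
    ("rs28934875", "A138P", "Potentially intolerant", "Probably damaging", "In a lung tumor"),
    ("rs35163653", "V217M", "Potentially intolerant", "Possibly damaging", "Nil"),
    ("rs55832599", "R267W", "Intolerant", "Probably damaging", "Nil"),
    ("rs112431538", "E285K", "Intolerant", "Probably damaging", "Nil"),
    ("rs121913343", "R273C", "Intolerant", "Probably damaging", "Nil"),
]


def breast_associated(rows):
    """Keep variants flagged by both predictors whose phenotype mentions breast."""
    return [
        (rsid, change)
        for rsid, change, sift, pph2, phenotype in rows
        if "intolerant" in sift.lower()
        and "damaging" in pph2.lower()
        and "breast" in phenotype.lower()
    ]


hits = breast_associated(table3)
print(hits)
```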



Molecular dynamics simulation studies 

Results from the calculations of RMSD for backbone and Cα atoms, root mean square fluctuation
(RMSF) for Cα atoms, and radius of gyration (Rg) for Cα atoms and protein for the apo and holo WT,
R110P, P151T and P278A MDS are presented in Table 4. To analyze the impact of the MTs on the p53C,
we examined the RMSD values. The calculated RMSDs of the backbone atoms in apo and holo WT, R110P,
P151T and P278A with respect to the starting structure during the 10-ns MDS are plotted as a
function of time in Fig. 3a, b. During the apo simulations, backbone RMSDs of the WT and MT
structures showed a sharp increase in the first 3 ns followed by equilibrium around 6 ns and a
sudden decrease around 7.5 ns for P151T and around 9.5 ns for P278A and R110P (Fig. 3a), whereas in
the holo simulations a different pattern was observed. A sharp increase is shown during the first
3.5 ns



Table 4. Time-averaged structural properties calculated for WT, R110P, P151T and P278A holo (with
Zn2+ ion present) and apo (with Zn2+ ion absent) p53 core domain.

                                   WT                        R110P                     P151T                     P278A
Property                           apo          holo         apo          holo         apo          holo         apo          holo
Backbone RMSD (nm)                 0.27 (0.04)  0.24 (0.02)  0.29 (0.03)  0.26 (0.04)  0.24 (0.04)  0.21 (0.03)  0.24 (0.05)  0.26 (0.05)
Cα-RMSD (nm)                       0.28 (0.04)  0.25 (0.03)  0.30 (0.03)  0.27 (0.04)  0.25 (0.04)  0.22 (0.03)  0.24 (0.05)  0.26 (0.05)
Cα-RMSF (nm)                       0.15 (0.08)  0.12 (0.07)  0.13 (0.08)  0.13 (0.07)  0.13 (0.07)  0.12 (0.06)  0.14 (0.08)  0.14 (0.09)
Rg-Cα (nm)                         1.64 (0.01)  1.64 (0.01)  1.62 (0.09)  1.63 (0.08)  1.64 (0.08)  1.63 (0.07)  1.63 (0.01)  1.64 (0.01)
Rg-protein (nm)                    1.67 (0.01)  1.67 (0.01)  1.65 (0.01)  1.66 (0.08)  1.66 (0.08)  1.66 (0.08)  1.66 (0.01)  1.66 (0.0)
Trace of the diagonalized
covariance matrix (nm^2)           5.67543      4.51117      4.86401      4.81546      4.55228      4.11307      5.70146      5.5473

Mean values, averaged over the trajectory or over the number of residues employed in each
calculation, with standard deviations given in parentheses. Cα-RMSD: Cα root-mean-square deviation;
Rg: radius of gyration; SASA: solvent accessible surface area.
doi:10.1371/journal.pone.0104242.t004
Figure 4. RMSD and DSSP changes in WT and MT structures during the 10-ns apo MDS. A) The top panel
shows the WT and R110P DSSP plot. In the middle, superimposed WT and R110P structures are shown.
Yellow: WT, red: R110P. At the bottom, the Cα RMSD plot is shown as a function of time. Black: WT,
red: R110P. B) The top panel shows the WT and P151T DSSP plot. In the middle, superimposed WT and
P151T structures are shown. Yellow: WT, green: P151T. At the bottom, the Cα RMSD plot is shown as a
function of time. Black: WT, green: P151T. C) The top panel shows the WT and P278A DSSP plot. In
the middle, superimposed WT and P278A structures are shown. Yellow: WT, blue: P278A. At the bottom,
the Cα RMSD plot is shown as a function of time. Black: WT, blue: P278A.
doi:10.1371/journal.pone.0104242.g004

Figure 5. RMSD and DSSP changes in WT and MT structures during the 10-ns holo MDS. A) The top panel
shows the WT and R110P DSSP plot. In the middle, superimposed WT and R110P structures are shown.
Yellow: WT, red: R110P. At the bottom, the Cα RMSD plot is shown as a function of time. Black: WT,
red: R110P. B) The top panel shows the WT and P151T DSSP plot. In the middle, superimposed WT and
P151T structures are shown. Yellow: WT, green: P151T. At the bottom, the Cα RMSD plot is shown as a
function of time. Black: WT, green: P151T. C) The top panel shows the WT and P278A DSSP plot. In
the middle, superimposed WT and P278A structures are shown. Yellow: WT, blue: P278A. At the bottom,
the Cα RMSD plot is shown as a function of time. Black: WT, blue: P278A.
doi:10.1371/journal.pone.0104242.g005




Table 5. Percentage-wise distribution of Cα RMSF values.

               All Cα                  Sec-str Cα
RMSF (nm)      <0.1 nm    >0.1 nm      <0.1 nm    >0.1 nm
WT apo         34         66           26         20
WT holo        47         52           34         13
R110P apo      47         52           31         9
R110P holo     38         61           27         18
P151T apo      46         53           32         12
P151T holo     44         55           29         15
P278A apo      35         64           27         17
P278A holo     41         58           29         15

doi:10.1371/journal.pone.0104242.t005
Table 6. Regions (α-helices, β-sheets, and loops) showing an average increase or decrease of RMSF
in the MTs compared to the WT; the RMSF of a particular structure is taken to be increased or
decreased if there is an average change in RMSF of >0.03 nm in at least >50% of its residues.






Simulation     Increase                              Decrease
R110P apo      L4, S4, Lg, L7, L10                   L3, Ls, L12
R110P holo     L1, S1, L3, S2, L4, L7, L10, L13      H2, L^
P151T apo      L4, L7, H1                            L3, Ls
P151T holo     L1, L2, L4, L7, L10, L11              L3, Ls, L12
P278A apo      L3, L4, L7, H2, L10                   LvLa
P278A holo     L1, L4, L7, L10, L11                  L3, Ls

doi:10.1371/journal.pone.0104242.t006



followed by equilibrium around 4 ns, a sudden increase around 7.5 ns for P151T and P278A, and a
sudden decrease around 9.5 ns for R110P. A comparison of average backbone RMSD values showed the
following order of structural deviations (Table 4): apo, R110P > WT > P151T = P278A; holo,
R110P = P278A > WT > P151T. The variation in the average backbone RMSD values of the WT and MTs
suggests that these mutations could affect the dynamic behavior of p53C, and thus provides a
suitable basis for further analyses.

Since the RMSD of the Cα atoms is a central measure for characterizing the protein system [33], we
calculated the respective Cα RMSDs for both apo and holo simulations and plotted them in Fig. 4, 5.
During the apo simulations, the Cα-RMSD of R110P showed a sharp increase in the initial 2.5 ns
followed by equilibrium around 4 ns and a sudden decrease after 9 ns (Fig. 4a). However, P151T and
P278A showed a different trend of Cα-RMSD, with an equilibrium around 4 ns and a sudden decrease
around 7 ns for P151T (Fig. 4b), and an equilibrium around 6 ns and a sudden decrease around 9 ns
for P278A (Fig. 4c). During the holo simulations, on the other hand, the Cα-RMSD of R110P showed
less variation in the initial 2 ns followed by equilibrium around 4 ns and a sudden decrease after
5 ns (Fig. 5a). However, P151T and P278A showed a different trend of Cα-RMSD, with an equilibrium
around 5 ns and a sudden decrease around 9.5 ns for P151T (Fig. 5b), and an equilibrium around 4 ns
and a sudden increase around 7 ns for P278A (Fig. 5c). A comparison of average Cα-RMSD values
showed the following order of structural deviations (Table 4): apo, R110P > WT > P151T > P278A;
holo, R110P > P278A > WT > P151T. These results indicate that the greatest change was observed in
R110P compared to the other mutants in both apo and holo simulations.

To analyze the change in secondary structure patterns in the WT and MTs, we applied the software
tool DSSP (Database of Secondary Structure in Proteins) by Kabsch and Sander [34], which employs
H-bonding patterns and various other geometrical features to assign secondary structure labels to
the residues of a protein. We plotted the secondary structure patterns of the WT and MTs for both
apo and holo simulations and also superimposed their respective structures at the beginning of the
simulation and at specific time steps where the conformational drifts were largest (Fig. 4, 5).
Analysis of time-dependent secondary structure fluctuations through DSSP showed, during the apo
simulations, a conformational drift from β-sheet to bend between residues 165-175 in R110P and from
α-helix to bend at the 180th residue in P151T and P278A. During the holo simulations, it showed a
conformational drift from α-helix to bend at the 180th residue in R110P and P151T and from turn to
bend at the 130th residue in P278A (Fig. 4, 5). These conformational changes support our previous
results from the RMSD analysis that the major change occurred in R110P (Fig. 3, 4, 5).

To understand how the mutants affect the dynamic behaviour of the residues and to examine the cause
of the conformational drifts observed in the RMSD and secondary structure patterns, the Cα root
mean square fluctuation (Cα-RMSF) of the WT and MT amino acid residues was calculated and plotted
in Fig. 6a-d. Except for P151T, in all cases the MT holo simulations had higher average Cα-RMSFs
than the WT holo simulation (Fig. 6a) (Table 4). In the apo and holo WT, more than 50% of the
residues have RMSF values >0.1 nm (Table 5), indicating a higher level of fluctuation. During the
holo simulations, all the MTs showed a larger percentage of residues (i.e., Cα residues and
residues in the protein core comprising secondary structural elements) with RMSF values >0.1 nm
than the WT, whereas during the apo simulations a smaller percentage of residues showed RMSF values
>0.1 nm compared to the WT. These results indicate that, compared to apo, the holo simulations are
associated with an increase in flexibility in the MTs. Among the holo MTs, R110P has the highest
percentage of residues with RMSF >0.1 nm, indicating a larger effect on the overall flexibility of
the p53C.
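
The percentage distribution reported in Table 5 amounts to counting residues whose Cα-RMSF exceeds
0.1 nm. A minimal sketch of that count follows; the function name percent_above is hypothetical,
and the example uses only the four Zn2+ binding residues of the WT holo run (Table 7) as a toy
input, not the full set of residues behind Table 5.

```python
def percent_above(rmsf_values, threshold=0.1):
    """Percentage of residues whose RMSF (nm) is strictly above the threshold."""
    if not rmsf_values:
        raise ValueError("no RMSF values supplied")
    above = sum(1 for value in rmsf_values if value > threshold)
    return 100.0 * above / len(rmsf_values)


# Cα RMSF (nm) of C176, H179, C238 and C242 in the WT holo simulation (Table 7)
wt_holo_zn_site = [0.1452, 0.148, 0.0959, 0.1733]
share = percent_above(wt_holo_zn_site)
print(share)
```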



Table 7. Cα RMSF values (nm) at the Zn2+ binding residues C176, H179, C238 and C242.

Residue  WT holo  WT apo  R110P apo  R110P holo  P151T apo  P151T holo  P278A apo  P278A holo
C176     0.1452   0.1598  0.1021     0.1056      0.1588     0.1458      0.1247     0.1331
H179     0.148    0.1985  0.1322     0.1301      0.1846     0.1523      0.3181     0.1354
C238     0.0959   0.1383  0.0726     0.0619      0.1311     0.0893      0.0809     0.0814
C242     0.1733   0.1553  0.1042     0.0978      0.1578     0.1367      0.1539     0.1983

doi:10.1371/journal.pone.0104242.t007
Figure 7. Solvent-accessible surface area (SASA) of WT and MT versus time during A) apo and B) holo
simulations for p53C. Black: WT, red: R110P, green: P151T and blue: P278A.
doi:10.1371/journal.pone.0104242.g007

Figure 8. Radius of gyration of Cα atoms during a 10-ns MDS for WT and MT p53C versus time. A), B),
C) represent apo simulations; D), E), F) represent holo simulations. Black: WT, red: R110P, green:
P151T and blue: P278A.
doi:10.1371/journal.pone.0104242.g008

Figure 9. Radius of gyration of the protein during a 10-ns MDS for WT and MT p53C versus time. A),
B), C) represent apo simulations; D), E), F) represent holo simulations. Black: WT, red: R110P,
green: P151T and blue: P278A.
doi:10.1371/journal.pone.0104242.g009

Figure 10. Average number of protein-solvent intermolecular hydrogen bonds in WT and MT p53C versus
time. A), B), C) represent apo simulations; D), E), F) represent holo simulations. Black: WT, red:
R110P, green: P151T and blue: P278A.
doi:10.1371/journal.pone.0104242.g010



Further, comparison of the regional flexibilities of the MTs 
showed a characteristic increase and decrease of flexibility in 
certain loops, helices and β-sheets (Table 6). Strand S10 showed 
consistently low fluctuation across all the simulations, whereas 
higher fluctuations around the Zn²⁺ binding residues C176, 
H179, C238 and C242 were observed in apo WT and MTs 
compared to the holo WT simulations (Table 7). Loop regions, on 
the other hand, showed larger fluctuations in both WT and MTs. 
Changes in the loops L3, L11, L12 and the H2 helix contributed to 
a higher value of Cα-RMSF in the MTs, with a larger portion of the 
L3 loop, S10 strand and H2 helix shifted far from its starting 
position (Fig. 6a). Compared to the WT simulations, 
fluctuations were noticeably increased around the loops 
L3 and L7 in R110P (Fig. 6b), whereas in P151T a noticeable 
increase in fluctuations was observed around the loops L4 and L7 
(Fig. 6c). P278A, on the other hand, showed an increase in 
fluctuations at the loops L3, L7, L11 and the H2 helix 
(Fig. 6d). Results from the analysis of regional flexibilities indicate 
that all three MTs, R110P, P151T and P278A, affect the 
overall flexibility of p53C. 
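
The Table 7 values can be compared against the holo WT column with a few lines of Python. The sketch below only re-tabulates the published numbers and computes each entry's difference from the reference column; the helper name `diff_from_reference` is hypothetical and the choice of WT holo as reference is an assumption made for illustration.

```python
# Calpha RMSF values (nm) transcribed from Table 7, in the table's column order.
columns = ["WT holo", "WT apo", "R110P apo", "R110P holo",
           "P151T apo", "P151T holo", "P278A apo", "P278A holo"]

table7 = {
    "C176": [0.1452, 0.1598, 0.1021, 0.1056, 0.1588, 0.1458, 0.1247, 0.1331],
    "H179": [0.148, 0.1985, 0.1322, 0.1301, 0.1846, 0.1523, 0.3181, 0.1354],
    "C238": [0.0959, 0.1383, 0.0726, 0.0619, 0.1311, 0.0893, 0.0809, 0.0814],
    "C242": [0.1733, 0.1553, 0.1042, 0.0978, 0.1578, 0.1367, 0.1539, 0.1983],
}


def diff_from_reference(table, columns, reference="WT holo"):
    """Return RMSF differences (nm) of every column relative to the reference column."""
    ref = columns.index(reference)
    return {
        residue: {
            col: round(value - values[ref], 4)
            for col, value in zip(columns, values)
            if col != reference
        }
        for residue, values in table.items()
    }


if __name__ == "__main__":
    for residue, diffs in diff_from_reference(table7, columns).items():
        higher = [col for col, d in diffs.items() if d > 0]
        print(residue, "higher than WT holo in:", ", ".join(higher) or "none")
```

A positive difference marks a system in which the Zn²⁺ binding residue fluctuates more than in the holo WT simulation.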

SASA is a geometric measure of the extent to which an amino 
acid interacts with its environment (the solvent and the protein 
core). It is naturally proportional to the degree to which an amino 
acid is exposed to these environments [35]. A rise or fall in the 
SASA indicates a change in the exposed amino acid residues, 
thereby affecting the tertiary structure of a protein. Results from 
the analysis of SASA for apo and holo simulations showed a 
variation among the WT and MTs (Fig. 7a-b). MTs (apo: 
R110P: 117.5916, P151T: 118.8266, P278A: 118.7204; holo: 
R110P: 118.1768, P151T: 118.3956, P278A: 118.7466) showed a 
lower average total SASA compared to the WT (apo: 119.7036; 
holo: 118.8847). Rg, on the other hand, is a parameter that describes 
the equilibrium conformation of a total system; in analyzing 
proteins in particular, it is indicative of the level of compaction 
of the structure, i.e. how the polypeptide chain is folded or 
unfolded [36]. Rg plots for Cα atoms and protein over 
the course of the 10 ns apo and holo simulations are 
shown in Fig. 8a-f and Fig. 9a-f. In the Rg plots for both Cα 
atoms and protein, we observed a notable fluctuation in MTs 
compared to the WT. Among the MTs, large deviations in Rg 
from the WT structure were observed during the apo and holo 
simulations of R110P (Table 4). These results indicate that, 
compared to other MTs, p53C might have undergone a significant 
structural transition due to R110P. 
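
For reference, the radius of gyration described above can be computed per frame as the mass-weighted root mean square distance of the atoms from their centre of mass. The following Python sketch shows this standard definition under stated assumptions: coordinates are NumPy arrays in nm, unit masses are used when none are given (a reasonable simplification for Cα-only selections), and the function names are hypothetical rather than taken from the software used in this study.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Radius of gyration (same length unit as coords) for one frame.

    coords: array of shape (n_atoms, 3)
    masses: optional array of shape (n_atoms,); unit masses if omitted.
    """
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))
    masses = np.asarray(masses, dtype=float)
    center = np.average(coords, axis=0, weights=masses)   # centre of mass
    sq_dist = ((coords - center) ** 2).sum(axis=1)         # squared distance of each atom
    return np.sqrt(np.average(sq_dist, weights=masses))


def rg_time_series(traj, masses=None):
    """Rg for every frame of a trajectory of shape (n_frames, n_atoms, 3)."""
    return np.array([radius_of_gyration(frame, masses) for frame in traj])


if __name__ == "__main__":
    # Synthetic coordinates only; not data from this study.
    rng = np.random.default_rng(1)
    fake_traj = rng.normal(scale=1.0, size=(5, 100, 3))
    print(rg_time_series(fake_traj))
```

Comparing the mean and spread of such a time series between WT and MT trajectories is one simple way to quantify the deviations in compactness described in the text.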

Further, one of the factors that accounts for maintaining the 
stable conformation of a protein is hydrogen bonding. In order to 
understand the reason for flexibility among the MTs, we 
performed NH bond analysis of the WT and MTs during apo and 
holo simulations and plotted the results in Fig. 10a-f. Results 
showed a notable difference in the protein-solvent intermolecular 
hydrogen bond pattern between the WT and MTs. Among the MTs, a 
decrease in the average number of hydrogen bonds was observed 
in R110P compared to the WT during both apo and holo 
simulations (Fig. 10a,d), indicating that the occurrence of this 
mutation may lead to a more flexible conformation in the presence 
or absence of Zn²⁺ at physiological conditions. However, the other 

Figure 11. Number density plot of p53C. A) Apo WT B) Apo R110P C) Apo P151T D) Apo P278A E) Holo WT F) Holo R110P G) Holo P151T H) Holo 
P278A. 
doi:10.1371/journal.pone.0104242.g011 

MTs, P151T and P278A, showed a decrease in the average number of 
protein-solvent intermolecular hydrogen bonds only during holo 
simulations (Fig. 10e,f), indicating that these two mutants are 
flexible only in the presence of Zn²⁺ at physiological conditions. 
Further, we have plotted the atom density distribution to check if 
the MTs have caused any major changes in the orientation and 

Figure 12. Plot of eigenvalues corresponding to eigenvector index for the first fifty modes of motion of p53C. A) represents the apo 
simulation B) represents the holo simulation. Black: WT, red: R110P, green: P151T and blue: P278A. 
doi:10.1371/journal.pone.0104242.g012 

Figure 13. Projection of the motion of the p53C in the phase space along the first two principal eigenvectors. A), C), E) represent the 
apo simulation; B), D), F) represent the holo simulation. Black: WT, red: R110P, green: P151T and blue: P278A. 
doi:10.1371/journal.pone.0104242.g013 

atomic distribution. Results showed that the atomic distribution of 
all the MTs differed significantly from the WT in apo and holo 
simulations (Fig. 11a-h), indicating that all the MTs have a 
deleterious effect on the p53C. 
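
A number density map of the kind referred to above can be approximated by binning atomic positions over all frames into a two-dimensional histogram. The Python sketch below is an illustration only: the projection onto the xy plane, the bin count and the function name `number_density_map` are assumptions, and it is not claimed to reproduce the settings used to generate Fig. 11.

```python
import numpy as np


def number_density_map(traj, bins=50, extent=None):
    """Frame-averaged 2D number density (atoms per nm^2) projected on the xy plane.

    traj: array of shape (n_frames, n_atoms, 3), coordinates in nm.
    extent: optional [[xmin, xmax], [ymin, ymax]]; data range if omitted.
    """
    traj = np.asarray(traj, dtype=float)
    n_frames = traj.shape[0]
    x = traj[:, :, 0].ravel()
    y = traj[:, :, 1].ravel()
    if extent is None:
        extent = [[x.min(), x.max()], [y.min(), y.max()]]
    counts, xedges, yedges = np.histogram2d(x, y, bins=bins, range=extent)
    bin_area = np.outer(np.diff(xedges), np.diff(yedges))  # area of each grid cell
    return counts / (n_frames * bin_area), xedges, yedges


if __name__ == "__main__":
    # Synthetic coordinates only; not data from this study.
    rng = np.random.default_rng(2)
    wt = rng.normal(scale=1.0, size=(100, 300, 3))
    mt = rng.normal(scale=1.2, size=(100, 300, 3))
    box = [[-4.0, 4.0], [-4.0, 4.0]]
    wt_map, _, _ = number_density_map(wt, bins=40, extent=box)
    mt_map, _, _ = number_density_map(mt, bins=40, extent=box)
    print("largest absolute density difference:", np.abs(mt_map - wt_map).max())
```

Using the same `extent` for every system keeps the grids aligned, so the WT and MT maps can be subtracted cell by cell.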

Moreover, to identify the correlated motions of the WT and 
MTs during the trajectories generated by apo and holo simulations, 
and to support our MDS results, we performed ED analysis. Since 
the sum of the eigenvalues is a measure of the total motility in the 
system, we have plotted the eigenvalues against the corresponding 
eigenvector index for the first ten modes of motion at different 
trajectory lengths for WT and MTs during the apo and holo 
simulations in Fig. 12a,b. Only a few eigenvectors showed large 
eigenvalues for both WT and MTs during the apo and holo 
simulations, indicating that most of the internal motion of the 
protein is confined to a small number of dimensions in the 
essential subspace. The spectrum of eigenvalues in Fig. 12a,b 
indicated that the major fluctuations of the system were confined 
to the first two eigenvectors. Hence, the projection of trajectories 
of WT and MTs during the apo and holo simulations in the phase 
space along the first two principal components (PC1, PC2) at 
300 K was plotted in Fig. 13a-f. Compared to the apo simulations, 
during the holo simulations the MTs covered a larger region of 
phase space along the PC1 and PC2 plane than the WT. The overall 
flexibility of WT and MTs was calculated by the trace of the 
diagonalized covariance matrix of the Cα atomic positional 
fluctuations. Results from the trace of the covariance matrix 
(Table 4) confirmed the differences in overall flexibility between 
MTs and WT at 300 K during both apo and holo simulations. Overall, 
the results reported from our study have confirmed that the 
substitution of Arginine at the 110th residue with Proline, Proline 
at the 151st residue with Threonine and Proline at the 278th 
residue with Alanine in the p53 core domain, in the presence 
or absence of Zn²⁺, alters structural stability and essential 
hydrogen bond formation, and thus these three mutants might play 
a significant role in initiating the susceptibility towards breast 
cancer. Further, our analysis indicates that compared to other 